In his 1987 paper (Cognitive Social Structures), Krackhardt includes
data from a 21-member network of managers (all male) at a tech
manufacturing company. These data include not only egocentric reports
(these are the people I go to for advice), but also allocentric
reports (I think John often asks Bill for advice). We can try testing
whether multistep relational abstraction explains the subjects’
allocentric beliefs in this dataset.
Setup
# Set a random seed for reproducibility
set.seed(sum(utf8ToInt("krackhardt")))
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
## here() starts at /Users/jaeyoungson/Documents/GitHub/multistep-relational-abstraction
library(tidygraph)
##
## Attaching package: 'tidygraph'
##
## The following object is masked from 'package:stats':
##
## filter
library(ggraph)
library(patchwork)
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
##
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
library(lmerTest)
##
## Attaching package: 'lmerTest'
##
## The following object is masked from 'package:lme4':
##
## lmer
##
## The following object is masked from 'package:stats':
##
## step
library(broom.mixed)
kable <- knitr::kable
vif <- car::vif
knitting <- knitr::is_html_output()
if (knitting) {
# Create directories for saving stuff
if (!dir.exists(here("outputs"))) {
dir.create(here("outputs"))
}
if (!dir.exists(here("outputs", "07-krackhardt"))) {
dir.create(here("outputs", "07-krackhardt"))
}
}
# Pull in some modeling tools
source(here("code", "utils", "representation_utils.R"))
source(here("code", "utils", "network_utils.R"))
gg <- list(
theme_bw(),
theme(
plot.title = element_text(hjust = 0.5, size = 13),
plot.subtitle = element_text(hjust = 0.5, size = 11),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.spacing = unit(0.75, "lines"),
legend.box.spacing = unit(0.5, "lines"),
legend.margin = margin(c(0, 0, 0, 0), unit = "lines")
)
)
plot_representation <- function(rep_df, relation_value, n_breaks = 3) {
plot_df <- rep_df %>%
filter(from != to) %>%
group_by(from, to) %>%
summarise(value = mean({{relation_value}}), .groups = "drop")
plot_df %>%
# Plot
mutate(
across(c(from, to), factor),
from = fct_rev(from)
) %>%
ggplot(aes(x=to, y=from, fill=value)) +
gg +
geom_tile() +
coord_fixed() +
scale_fill_viridis_c(
option = "magma", name = NULL, na.value = "white", n.breaks = n_breaks
) +
theme(
legend.position = "bottom",
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)
)
}
check_significance <- function(tidy_stats) {
tidy_stats %>%
mutate(
significance = case_when(
p.value < 0.001 ~ "***",
p.value < 0.01 ~ "**",
p.value < 0.05 ~ "*",
p.value < 0.1 ~ ".",
TRUE ~ ""
)
)
}
These data are included in Krackhardt 1987, Appendix A. Each
allocentric CSV contains an adjacency matrix, so we’ll need to do a
little work to get them into a tidy-friendly format.
The “egocentric” network is defined in the way that social networks
are most often defined: based on each subject’s answer to the question,
“Who are you connected to?” Oftentimes, this question is phrased to
measure friendships (e.g., “Are you friends with X?”). In this
particular dataset, the question is something closer to “Who do you ask
for advice?”
The “allocentric” cognitive maps measure each subject’s answer to the
question, “Does X ask Y for advice?” This includes the special case of,
“Does X ask you for advice?” Here’s the rationale for including the
special case in allocentric maps. Imagine for a moment that this were a
study about friendships instead of advice-giving. In that case, we would
consider “Are you friends with X?” to be a question that subjects would
know the answer to (or at least, they would know better than anyone
else). On the other hand, we would consider the question “Does X
consider you a friend?” to be more of an inference than knowledge. I
don’t know whether X considers me a friend; I don’t consider X to be a
friend, but we do occasionally eat lunch together. Moreover, in cases of
disagreement (X says he asks Y for advice; Y doesn’t agree), the “ground
truth” egocentric network would then hinge on the researcher’s decision
to use an intersection vs union rule to resolve disagreements;
Krackhardt notes this in his paper. By alternatively defining the
egocentric network based solely on the responses that subjects would
have greatest introspective access to (their own advice-seeking
behavior), we sidestep this researcher degree of freedom.
krackhardt_raw <- here("data", "krackhardt") %>%
fs::dir_ls(regexp = "krackhardt_tech_managers_sub[[:digit:]]{2}\\.csv") %>%
map_dfr(
.f = ~read_csv(.x, col_names = FALSE, show_col_types = FALSE),
.id = "filename"
) %>%
mutate(
sub_id = str_extract(filename, "sub[[:digit:]]{2}"),
sub_id = str_remove(sub_id, "sub"),
sub_id = as.numeric(sub_id)
) %>%
select(-filename) %>%
group_by(sub_id) %>%
mutate(from = row_number()) %>%
ungroup() %>%
pivot_longer(
cols = -c(sub_id, from),
names_to = "to",
values_to = "edge"
) %>%
mutate(
to = str_remove(to, "X"),
to = as.integer(to)
)
krackhardt_egocentric <- krackhardt_raw %>%
filter(sub_id == from, from != to) %>%
select(-sub_id)
krackhardt_allocentric <- krackhardt_raw %>%
filter(sub_id != from, from != to)
Plot data
Okay, let’s try and get a sense for what the “egocentric” network
looks like.
krackhardt_g <- krackhardt_egocentric %>%
filter(edge == 1) %>%
select(-edge) %>%
tbl_graph(edges = .) %>%
mutate(name = row_number())
plot_network_graph <- krackhardt_g %>%
mutate(Degree = centrality_degree(mode = "in")) %>%
# Plot the "ground truth" network
ggraph(layout = "stress") +
geom_edge_link(
aes(
start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
),
arrow = arrow(length = unit(5, "pt"), type = "closed")
) +
geom_node_label(
aes(label = name, size = Degree),
fill = "grey95", show.legend = FALSE
) +
scale_size(range = c(3, 10)) +
theme(
panel.background = element_blank(),
strip.background = element_blank(),
panel.border = element_blank(),
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5, size = 13),
strip.text = element_text(hjust = 0.5, size = 13),
legend.position = "bottom"
)
plot_network_graph
## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

We can also plot this as a matrix instead of a graph, which will ease
comparison with the cognitive strategies we’ll consider next.
plot_network_adj <- krackhardt_egocentric %>%
plot_representation(edge, n_breaks = 2) +
ggtitle("Egocentric Network")
plot_network_adj

How does this compare to subjects’ mental representation of the
network?
plot_choice <- krackhardt_allocentric %>%
plot_representation(relation_value = edge) +
ggtitle("Network Representation")
plot_network_adj | plot_choice

In the original paper, Krackhardt makes the following note: “There
are some centers of focus for advice, notably 2 and 21 (both are vice
presidents) … In [the subjectively perceived network], 21 loses his
prominence as a major recipient of nominations: instead, 18, 14, and 7
(the president) appear to be more central.” To get a quantitative sense
for this in our data, we can try calculating the “perceived” degree
centrality by summing over the matrix columns, then comparing that
against the “true” in-degree centrality.
centrality_differences <- krackhardt_g %>%
mutate(
Degree = centrality_degree(mode = "in"),
Betweenness = centrality_betweenness(directed = TRUE)
) %>%
as_tibble() %>%
# Add represented in-degree; standardize sum by number of network members
left_join(
krackhardt_allocentric %>%
group_by(name = to) %>%
summarise(Perceived = sum(edge)/21, .groups = "drop"),
by = c("name")
) %>%
select(name, Perceived, everything()) %>%
arrange(desc(Degree), desc(Perceived))
We can see that the same number of managers go to subjects 18 and 21
for advice (15 each), but that subject 18 is perceived as being much
more “popular” (averaged across subjects, 11.95 managers are perceived
to go to him for advice) than subject 21 (averaged across subjects, 8.38
managers are perceived to go to him for advice). A similar phenomenon
happens with subjects 1 and 7, where subject 1 is perceived as being
much less popular than subject 7, despite their having the same “true”
popularity.
centrality_differences %>%
ggplot(aes(x=Degree, y=Perceived)) +
gg +
geom_smooth(method = "lm", se = FALSE) +
geom_label(aes(label = name)) +
scale_x_continuous(name = "True (degree) centrality") +
scale_y_continuous(name = "Perceived centrality")
## `geom_smooth()` using formula = 'y ~ x'

centrality_differences %>%
select(name, Degree, Perceived) %>%
kable()
| 2 |
18 |
13.047619 |
| 18 |
15 |
11.952381 |
| 21 |
15 |
8.380952 |
| 7 |
13 |
10.380952 |
| 1 |
13 |
5.095238 |
| 11 |
11 |
6.761905 |
| 14 |
10 |
11.428571 |
| 6 |
10 |
4.904762 |
| 8 |
10 |
4.666667 |
| 10 |
9 |
3.952381 |
| 17 |
9 |
3.809524 |
| 20 |
8 |
4.190476 |
| 4 |
8 |
4.000000 |
| 16 |
8 |
3.428571 |
| 12 |
7 |
3.190476 |
| 5 |
5 |
4.761905 |
| 3 |
5 |
4.476191 |
| 19 |
4 |
4.619048 |
| 15 |
4 |
2.809524 |
| 13 |
4 |
2.714286 |
| 9 |
4 |
2.619048 |
Is it possible that subjects’ network representations might be
reflecting some other centrality metric? In the original paper,
Krackhardt notes that betweenness seems to have some predictive power.
For the 18-21 and 1-7 test cases, we see that betweenness centrality
does make the “correct” prediction about which network member is
represented as being more important. But, we should note that
betweenness doesn’t do the greatest job predicting perceived popularity
across the board; its greatest predictive power is in explaining the
popularity discrepancy for the most popular managers.
centrality_differences %>%
pivot_longer(cols = Degree:Betweenness, names_to = "metric") %>%
ggplot(aes(x=value, y=Perceived)) +
gg +
facet_wrap(~metric, scales = "free_x") +
geom_smooth(method = "lm", se = FALSE) +
geom_label(aes(label = name)) +
scale_x_continuous(name = "True centrality") +
scale_y_continuous(name = "Perceived centrality")
## `geom_smooth()` using formula = 'y ~ x'

centrality_differences %>%
kable()
| 2 |
13.047619 |
18 |
5.9357143 |
| 18 |
11.952381 |
15 |
88.9166667 |
| 21 |
8.380952 |
15 |
60.1269841 |
| 7 |
10.380952 |
13 |
27.6246032 |
| 1 |
5.095238 |
13 |
13.7468254 |
| 11 |
6.761905 |
11 |
1.1984127 |
| 14 |
11.428571 |
10 |
0.5888889 |
| 6 |
4.904762 |
10 |
0.0000000 |
| 8 |
4.666667 |
10 |
3.9746032 |
| 10 |
3.952381 |
9 |
18.2968254 |
| 17 |
3.809524 |
9 |
2.5317460 |
| 20 |
4.190476 |
8 |
7.9793651 |
| 4 |
4.000000 |
8 |
13.7087302 |
| 16 |
3.428571 |
8 |
0.7000000 |
| 12 |
3.190476 |
7 |
0.2539683 |
| 5 |
4.761905 |
5 |
5.0785714 |
| 3 |
4.476191 |
5 |
6.6047619 |
| 19 |
4.619048 |
4 |
0.7539683 |
| 15 |
2.809524 |
4 |
6.1325397 |
| 13 |
2.714286 |
4 |
0.8928571 |
| 9 |
2.619048 |
4 |
3.9539683 |
Simulated observations
We know that an individual’s embedding within a network dictates what
interactions and/or relationships they’re able to observe. Therefore,
when building predicted cognitive maps, we should use a different set of
interactions to inform each subject’s predicted representations. Of
course, we have no idea what individual subjects’ observations actually
are. So we’ll need to come up with a compact and reasonable set of
assumptions about what observations were most likely available to
individual network members.
Here are three key assumptions we’ll make here:
- We’ll assume (trivially) that people can observe interactions
between themselves and their immediate advisors (and/or advisees).
- We’ll also assume that people are generally able to observe
interactions between advisors-of-advisors (and/or advisees-of-advisees),
either through firsthand observation or secondhand chatter.
- Though of course it’s plausible that people might observe more
distant interactions, we’ll assume this happens with enough infrequency
that we can ignore those interactions for the sake of modeling.
We’ve left open an ambiguity in those assumptions: are people
observing advisors-of-advisors, and/or advisees-of-advisees? To aid
understanding, we’ll use the following example. Suppose we’re looking at
a subset of a network consisting of the grad student Abby, who asks the
postdoc Betty for advice; in turn, Betty the postdoc asks Cathy the
assistant professor for advice; and Cathy asks Dani the associate
professor for advice. There are three reasonable ways that observations
might work in this network:
- Observations go “out”: When Abby asks Betty for advice, Betty
responds by sharing the wisdom she’s gained from asking Cathy for
advice. That leads Abby to observe that Betty goes to Cathy for
advice.
- Observations go “in”: Betty has asked Cathy for advice. Cathy thinks
Betty has asked a great question that she doesn’t know the answer to, so
Cathy asks Dani about it. That leads Dani to observe that Betty goes to
Cathy for advice.
- It goes both ways.
We may be able to test this empirically. We would need to simulate a
set of observations consistent with all three possibilities, generate
model predictions from those simulations, then compare how well those
models fit managers’ actual perceptions of the network.
# Observations go both ways
obs_edgelist_all <- expand_grid(
sub_id = 1:21,
# Start with all one-step egocentric observations
krackhardt_egocentric
) %>%
# Convert adjlist into edgelist
filter(edge == 1) %>%
select(-edge) %>%
# For a given node, we want to know who their "advisors-of-advisors" are
# (or equivalently, "advisees-of-advisees")
left_join(
krackhardt_g %>%
mutate(neighbors = local_members(order = 1, mode = "all")) %>%
as_tibble(),
by = c("sub_id" = "name")
) %>%
# We assume that nodes are able to observe (directly or through chatter)
# advice-giving interactions from their advisors and advisors-of-advisors
# (or equivalently, advisees and advisees-of-advisees)
rowwise() %>%
filter((from %in% neighbors) | (to %in% neighbors)) %>%
ungroup() %>%
select(-neighbors) %>%
# Make pretty
arrange(sub_id, from)
# Advisees observe who advises their advisors
obs_edgelist_out <- expand_grid(
sub_id = 1:21,
krackhardt_egocentric
) %>%
filter(edge == 1) %>%
select(-edge) %>%
left_join(
krackhardt_g %>%
mutate(neighbors = local_members(order = 1, mode = "out")) %>%
as_tibble(),
by = c("sub_id" = "name")
) %>%
rowwise() %>%
filter((from %in% neighbors) | (to %in% neighbors)) %>%
ungroup() %>%
select(-neighbors) %>%
arrange(sub_id, from)
# Advisors observe who their advisees advise
obs_edgelist_in <- expand_grid(
sub_id = 1:21,
krackhardt_egocentric
) %>%
filter(edge == 1) %>%
select(-edge) %>%
left_join(
krackhardt_g %>%
mutate(neighbors = local_members(order = 1, mode = "in")) %>%
as_tibble(),
by = c("sub_id" = "name")
) %>%
rowwise() %>%
filter((from %in% neighbors) | (to %in% neighbors)) %>%
ungroup() %>%
select(-neighbors) %>%
arrange(sub_id, from)
obs_adjlist_all <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
left_join(
obs_edgelist_all %>% mutate(edge = 1),
by = c("sub_id", "from", "to")
) %>%
mutate(edge = if_else(is.na(edge), 0, edge))
obs_adjlist_out <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
left_join(
obs_edgelist_out %>% mutate(edge = 1),
by = c("sub_id", "from", "to")
) %>%
mutate(edge = if_else(is.na(edge), 0, edge))
obs_adjlist_in <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
left_join(
obs_edgelist_in %>% mutate(edge = 1),
by = c("sub_id", "from", "to")
) %>%
mutate(edge = if_else(is.na(edge), 0, edge))
Schema-based representation
If people were using triad or community schemas/heuristics to inform
representation, what would this look like? Using literally the exact
same tools that we used to analyze the butterfly network, let’s create
some predicted representations.
schema_experts_all <- obs_adjlist_all %>%
# Memorization
group_by(sub_id, from) %>%
mutate(
mem_value = edge / sum(edge),
mem_value = if_else(is.nan(mem_value), 0, mem_value)
) %>%
ungroup() %>%
# Now triads
left_join(
obs_adjlist_all %>%
group_by(sub_id) %>%
nest() %>%
mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
unnest(triad) %>%
ungroup() %>%
select(-c(data, transition_scaling)) %>%
mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
by = c("sub_id", "from", "to")
) %>%
# Finally communities
left_join(
obs_adjlist_all %>%
group_by(sub_id) %>%
nest() %>%
mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
unnest(comm) %>%
ungroup() %>%
select(-c(data, transition_scaling)),
by = c("sub_id", "from", "to")
)
schema_experts_out <- obs_adjlist_out %>%
# Memorization
group_by(sub_id, from) %>%
mutate(
mem_value = edge / sum(edge),
mem_value = if_else(is.nan(mem_value), 0, mem_value)
) %>%
ungroup() %>%
# Now triads
left_join(
obs_adjlist_out %>%
group_by(sub_id) %>%
nest() %>%
mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
unnest(triad) %>%
ungroup() %>%
select(-c(data, transition_scaling)) %>%
mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
by = c("sub_id", "from", "to")
) %>%
# Finally communities
left_join(
obs_adjlist_out %>%
group_by(sub_id) %>%
nest() %>%
mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
unnest(comm) %>%
ungroup() %>%
select(-c(data, transition_scaling)),
by = c("sub_id", "from", "to")
)
schema_experts_in <- obs_adjlist_in %>%
# Memorization
group_by(sub_id, from) %>%
mutate(
mem_value = edge / sum(edge),
mem_value = if_else(is.nan(mem_value), 0, mem_value)
) %>%
ungroup() %>%
# Now triads
left_join(
obs_adjlist_in %>%
group_by(sub_id) %>%
nest() %>%
mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
unnest(triad) %>%
ungroup() %>%
select(-c(data, transition_scaling)) %>%
mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
by = c("sub_id", "from", "to")
) %>%
# Finally communities
left_join(
obs_adjlist_in %>%
group_by(sub_id) %>%
nest() %>%
mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
unnest(comm) %>%
ungroup() %>%
select(-c(data, transition_scaling)),
by = c("sub_id", "from", "to")
)
plot_mem_all <- plot_representation(schema_experts_all, mem_value) +
ggtitle("Memorization")
plot_triad_all <- plot_representation(schema_experts_all, triad_value) +
ggtitle("Triad Completion")
plot_comm_all <- plot_representation(schema_experts_all, comm_value) +
ggtitle("Community Detection")
plot_schemas_all <- (
plot_choice | plot_mem_all | plot_triad_all | plot_comm_all
) +
plot_annotation(
title = "Observations of advice-giving go both ways",
theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
)
plot_schemas_all

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "schema_gallery_all.pdf"),
plot_schemas_all,
width = 12, height = 4,
device = cairo_pdf, units = "in", dpi = 300
)
}
plot_mem_out <- plot_representation(schema_experts_out, mem_value) +
ggtitle("Memorization")
plot_triad_out <- plot_representation(schema_experts_out, triad_value) +
ggtitle("Triad Completion")
plot_comm_out <- plot_representation(schema_experts_out, comm_value) +
ggtitle("Community Detection")
plot_schemas_out <- (
plot_choice | plot_mem_out | plot_triad_out | plot_comm_out
) +
plot_annotation(
title = "Advisees observe advisors' advisors",
theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
)
plot_schemas_out

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "schema_gallery_out.pdf"),
plot_schemas_out,
width = 12, height = 4,
device = cairo_pdf, units = "in", dpi = 300
)
}
plot_mem_in <- plot_representation(schema_experts_in, mem_value) +
ggtitle("Memorization")
plot_triad_in <- plot_representation(schema_experts_in, triad_value) +
ggtitle("Triad Completion")
plot_comm_in <- plot_representation(schema_experts_in, comm_value) +
ggtitle("Community Detection")
plot_schemas_in <- (
plot_choice | plot_mem_in | plot_triad_in | plot_comm_in
) +
plot_annotation(
title = "Advisors observe advisees' advisees",
theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
)
plot_schemas_in

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "schema_gallery_in.pdf"),
plot_schemas_in,
width = 12, height = 4,
device = cairo_pdf, units = "in", dpi = 300
)
}
At the group-level… schemas make frankly terrible predictions, no
matter what we assume about observations. Note that if a particular
subject is predicted to ask all other managers for advice, the row
associated with that subject will be filled with a value close to
1/20=0.05. So we can see that the triad completion strategy predicts
that most managers are connected, and the community detection strategy
predicts that literally all managers are connected. Neither comes even a
little bit close to the actual allocentric network representation.
Successor repesentation
Let’s say people use multistep relational abstraction to represent
networks, and let’s use the SR to approximate that kind of abstraction.
Using the observations we generated before, let’s create lots of
learning “trials” to train the algorithm.
sr_observations_all <- obs_edgelist_all %>%
# Create simulated "learning trials" to train an asymptotic SR
expand_grid(rep = 1:100) %>%
# Randomize "order of trials" per repetition
group_by(sub_id, rep) %>%
slice_sample(prop = 1) %>%
ungroup()
sr_observations_out <- obs_edgelist_out %>%
expand_grid(rep = 1:100) %>%
group_by(sub_id, rep) %>%
slice_sample(prop = 1) %>%
ungroup()
sr_observations_in <- obs_edgelist_in %>%
expand_grid(rep = 1:100) %>%
group_by(sub_id, rep) %>%
slice_sample(prop = 1) %>%
ungroup()
And now we can use those to build SRs at different scales. We’ll
assume that every subject is using the same successor horizon…
sr_all <- sr_observations_all %>%
expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
group_by(sub_id, gamma) %>%
nest() %>%
mutate(
sr = map2(
.x = data, .y = gamma,
~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
)
) %>%
unnest(sr) %>%
ungroup() %>%
select(-data) %>%
mutate(gamma = gamma * 10) %>%
pivot_wider(
names_from = gamma,
names_prefix = "sr",
values_from = sr_value
)
sr_out <- sr_observations_out %>%
expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
group_by(sub_id, gamma) %>%
nest() %>%
mutate(
sr = map2(
.x = data, .y = gamma,
~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
)
) %>%
unnest(sr) %>%
ungroup() %>%
select(-data) %>%
mutate(gamma = gamma * 10) %>%
pivot_wider(
names_from = gamma,
names_prefix = "sr",
values_from = sr_value
)
sr_in <- sr_observations_in %>%
expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
group_by(sub_id, gamma) %>%
nest() %>%
mutate(
sr = map2(
.x = data, .y = gamma,
~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
)
) %>%
unnest(sr) %>%
ungroup() %>%
select(-data) %>%
mutate(gamma = gamma * 10) %>%
pivot_wider(
names_from = gamma,
names_prefix = "sr",
values_from = sr_value
)
Let’s get a sense for the predictions when the SR is trained on
observations that go both ways.
plot_sr1_all <- sr_all %>%
plot_representation(relation_value = sr1, n_breaks = 4) +
ggtitle("\u03B3 = 0.1")
plot_sr3_all <- sr_all %>%
plot_representation(relation_value = sr3) +
ggtitle("\u03B3 = 0.3")
plot_sr5_all <- sr_all %>%
plot_representation(relation_value = sr5) +
ggtitle("\u03B3 = 0.5")
plot_sr7_all <- sr_all %>%
plot_representation(relation_value = sr7) +
ggtitle("\u03B3 = 0.7")
plot_sr9_all <- sr_all %>%
plot_representation(relation_value = sr9) +
ggtitle("\u03B3 = 0.9")
plot_sr_gallery_all <- (
plot_choice +
plot_sr1_all + plot_sr3_all + plot_sr5_all + plot_sr7_all + plot_sr9_all
)
plot_sr_gallery_all

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "sr_gallery_all.pdf"),
plot_sr_gallery_all,
width = 12, height = 8,
device = cairo_pdf, units = "in", dpi = 300
)
}
And now where advisees observe their advisors’ advisors…
plot_sr1_out <- sr_out %>%
plot_representation(relation_value = sr1, n_breaks = 4) +
ggtitle("\u03B3 = 0.1")
plot_sr3_out <- sr_out %>%
plot_representation(relation_value = sr3) +
ggtitle("\u03B3 = 0.3")
plot_sr5_out <- sr_out %>%
plot_representation(relation_value = sr5) +
ggtitle("\u03B3 = 0.5")
plot_sr7_out <- sr_out %>%
plot_representation(relation_value = sr7) +
ggtitle("\u03B3 = 0.7")
plot_sr9_out <- sr_out %>%
plot_representation(relation_value = sr9) +
ggtitle("\u03B3 = 0.9")
plot_sr_gallery_out <- (
plot_choice +
plot_sr1_out + plot_sr3_out + plot_sr5_out + plot_sr7_out + plot_sr9_out
)
plot_sr_gallery_out

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "sr_gallery_out.pdf"),
plot_sr_gallery_out,
width = 12, height = 8,
device = cairo_pdf, units = "in", dpi = 300
)
}
And finally where advisors observe their advisees’ advisees…
plot_sr1_in <- sr_in %>%
plot_representation(relation_value = sr1, n_breaks = 4) +
ggtitle("\u03B3 = 0.1")
plot_sr3_in <- sr_in %>%
plot_representation(relation_value = sr3) +
ggtitle("\u03B3 = 0.3")
plot_sr5_in <- sr_in %>%
plot_representation(relation_value = sr5) +
ggtitle("\u03B3 = 0.5")
plot_sr7_in <- sr_in %>%
plot_representation(relation_value = sr7) +
ggtitle("\u03B3 = 0.7")
plot_sr9_in <- sr_in %>%
plot_representation(relation_value = sr9) +
ggtitle("\u03B3 = 0.9")
plot_sr_gallery_in <- (
plot_choice +
plot_sr1_in + plot_sr3_in + plot_sr5_in + plot_sr7_in + plot_sr9_in
)
plot_sr_gallery_in

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "sr_gallery_in.pdf"),
plot_sr_gallery_in,
width = 12, height = 8,
device = cairo_pdf, units = "in", dpi = 300
)
}
Statistical tests
Okay, now that we have all of the predicted strategies in hand, we’ll
want to perform some formal statistical tests to verify that the SR is
truly doing the best at explaining subjects’ network representations.
Let’s create a dataframe with all of the info we’ll need.
test_dataset_all <- krackhardt_allocentric %>%
rename(choice = edge) %>%
left_join(schema_experts_all, by = c("sub_id", "from", "to")) %>%
left_join(sr_all, by = c("sub_id", "from", "to"))
test_dataset_out <- krackhardt_allocentric %>%
rename(choice = edge) %>%
left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
left_join(sr_out, by = c("sub_id", "from", "to"))
test_dataset_in <- krackhardt_allocentric %>%
rename(choice = edge) %>%
left_join(schema_experts_in, by = c("sub_id", "from", "to")) %>%
left_join(sr_in, by = c("sub_id", "from", "to"))
We’ll do all of the statistical tests for the “observations go both
ways” models. Note that we can’t actually estimate a model for community
detection, as this strategy predicts that everyone is connected to
everyone.
test_mem_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + (1 + mem_value | sub_id),
family = "binomial", data = .
)
test_triad_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
family = "binomial", data = .
)
# test_comm_all <- test_dataset_all %>%
# glmer(
# choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
# family = "binomial", data = .
# )
test_sr1_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
family = "binomial", data = .
)
test_sr2_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
family = "binomial", data = .
)
test_sr3_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
family = "binomial", data = .
)
test_sr4_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
family = "binomial", data = .
)
test_sr5_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
family = "binomial", data = .
)
test_sr6_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
family = "binomial", data = .
)
test_sr7_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
family = "binomial", data = .
)
test_sr8_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
family = "binomial", data = .
)
test_sr9_all <- test_dataset_all %>%
glmer(
choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
family = "binomial", data = .
)
Moving on to the “advisees observe advisors’ advisors” models.
test_mem_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + (1 + mem_value | sub_id),
family = "binomial", data = .
)
test_triad_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
family = "binomial", data = .
)
test_comm_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
family = "binomial", data = .
)
test_sr1_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
family = "binomial", data = .
)
test_sr2_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
family = "binomial", data = .
)
test_sr3_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
family = "binomial", data = .
)
test_sr4_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
family = "binomial", data = .
)
test_sr5_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
family = "binomial", data = .
)
test_sr6_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
family = "binomial", data = .
)
test_sr7_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
family = "binomial", data = .
)
test_sr8_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
family = "binomial", data = .
)
test_sr9_out <- test_dataset_out %>%
glmer(
choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
family = "binomial", data = .
)
Finally, the “advisors observe advisees’ advisees” models. Once
again, we can’t estimate a model for community detection, as this
strategy predicts that everyone is connected to everyone.
test_mem_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + (1 + mem_value | sub_id),
family = "binomial", data = .
)
test_triad_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
family = "binomial", data = .
)
# test_comm_in <- test_dataset_in %>%
# glmer(
# choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
# family = "binomial", data = .
# )
test_sr1_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
family = "binomial", data = .
)
test_sr2_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
family = "binomial", data = .
)
test_sr3_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
family = "binomial", data = .
)
test_sr4_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
family = "binomial", data = .
)
test_sr5_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
family = "binomial", data = .
)
test_sr6_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
family = "binomial", data = .
)
test_sr7_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
family = "binomial", data = .
)
test_sr8_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
family = "binomial", data = .
)
test_sr9_in <- test_dataset_in %>%
glmer(
choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
family = "binomial", data = .
)
Of all of these models, which has the “best” fit, based on BIC?
# Which has the best "goodness-of-fit"?
goodness_of_fit <- bind_rows(
# All: goes both ways
glance(test_mem_all) %>% mutate(model = "mem_all"),
glance(test_triad_all) %>% mutate(model = "triad_all"),
glance(test_sr1_all) %>% mutate(model = "sr1_all"),
glance(test_sr2_all) %>% mutate(model = "sr2_all"),
glance(test_sr3_all) %>% mutate(model = "sr3_all"),
glance(test_sr4_all) %>% mutate(model = "sr4_all"),
glance(test_sr5_all) %>% mutate(model = "sr5_all"),
glance(test_sr6_all) %>% mutate(model = "sr6_all"),
glance(test_sr7_all) %>% mutate(model = "sr7_all"),
glance(test_sr8_all) %>% mutate(model = "sr8_all"),
glance(test_sr9_all) %>% mutate(model = "sr9_all"),
# Out: advisors of advisors
glance(test_mem_out) %>% mutate(model = "mem_out"),
glance(test_triad_out) %>% mutate(model = "triad_out"),
glance(test_comm_out) %>% mutate(model = "comm_out"),
glance(test_sr1_out) %>% mutate(model = "sr1_out"),
glance(test_sr2_out) %>% mutate(model = "sr2_out"),
glance(test_sr3_out) %>% mutate(model = "sr3_out"),
glance(test_sr4_out) %>% mutate(model = "sr4_out"),
glance(test_sr5_out) %>% mutate(model = "sr5_out"),
glance(test_sr6_out) %>% mutate(model = "sr6_out"),
glance(test_sr7_out) %>% mutate(model = "sr7_out"),
glance(test_sr8_out) %>% mutate(model = "sr8_out"),
glance(test_sr9_out) %>% mutate(model = "sr9_out"),
# In: advisees of advisees
glance(test_mem_in) %>% mutate(model = "mem_in"),
glance(test_triad_in) %>% mutate(model = "triad_in"),
glance(test_sr1_in) %>% mutate(model = "sr1_in"),
glance(test_sr2_in) %>% mutate(model = "sr2_in"),
glance(test_sr3_in) %>% mutate(model = "sr3_in"),
glance(test_sr4_in) %>% mutate(model = "sr4_in"),
glance(test_sr5_in) %>% mutate(model = "sr5_in"),
glance(test_sr6_in) %>% mutate(model = "sr6_in"),
glance(test_sr7_in) %>% mutate(model = "sr7_in"),
glance(test_sr8_in) %>% mutate(model = "sr8_in"),
glance(test_sr9_in) %>% mutate(model = "sr9_in"),
) %>%
select(model, everything()) %>%
arrange(BIC) %>%
# Make this a little more human-readable
separate(model, into = c("model", "direction")) %>%
mutate(
direction = case_when(
direction == "out" ~ "advisee observes advisor of advisor",
direction == "in" ~ "advisor observes advisee of advisee",
direction == "all" ~ "both ways"
)
)
goodness_of_fit %>%
mutate(
model = fct_relevel(
model,
"sr1", "sr2", "sr3", "sr4", "sr5", "sr6", "sr7", "sr8", "sr9",
"mem", "triad", "comm"
)
) %>%
ggplot(aes(x = model, y = BIC, color = direction)) +
gg +
geom_point() +
ggtitle("Model goodness-of-fit") +
theme(legend.position = "bottom")

goodness_of_fit %>%
kable()
| sr7 |
advisee observes advisor of advisor |
8400 |
1 |
-4363.029 |
8738.058 |
8780.274 |
8565.413 |
8394 |
| sr8 |
advisee observes advisor of advisor |
8400 |
1 |
-4363.853 |
8739.706 |
8781.922 |
8566.508 |
8394 |
| sr6 |
advisee observes advisor of advisor |
8400 |
1 |
-4369.466 |
8750.931 |
8793.147 |
8580.404 |
8394 |
| sr9 |
advisee observes advisor of advisor |
8400 |
1 |
-4375.204 |
8762.408 |
8804.624 |
8592.060 |
8394 |
| sr5 |
advisee observes advisor of advisor |
8400 |
1 |
-4379.140 |
8770.281 |
8812.497 |
8602.266 |
8394 |
| sr4 |
advisee observes advisor of advisor |
8400 |
1 |
-4389.390 |
8790.781 |
8832.996 |
8625.097 |
8394 |
| sr8 |
both ways |
8400 |
1 |
-4398.735 |
8809.469 |
8851.685 |
8657.019 |
8394 |
| sr7 |
both ways |
8400 |
1 |
-4398.873 |
8809.746 |
8851.962 |
8657.895 |
8394 |
| sr3 |
advisee observes advisor of advisor |
8400 |
1 |
-4398.903 |
8809.805 |
8852.021 |
8646.054 |
8394 |
| sr6 |
both ways |
8400 |
1 |
-4403.292 |
8818.583 |
8860.799 |
8668.298 |
8394 |
| sr9 |
both ways |
8400 |
1 |
-4406.178 |
8824.357 |
8866.573 |
8673.713 |
8394 |
| sr2 |
advisee observes advisor of advisor |
8400 |
1 |
-4407.193 |
8826.387 |
8868.603 |
8664.134 |
8394 |
| sr5 |
both ways |
8400 |
1 |
-4409.244 |
8830.487 |
8872.703 |
8681.930 |
8394 |
| sr1 |
advisee observes advisor of advisor |
8400 |
1 |
-4414.189 |
8840.378 |
8882.594 |
8679.243 |
8394 |
| sr4 |
both ways |
8400 |
1 |
-4415.308 |
8842.615 |
8884.831 |
8695.618 |
8394 |
| sr3 |
both ways |
8400 |
1 |
-4420.912 |
8853.824 |
8896.040 |
8708.120 |
8394 |
| triad |
advisee observes advisor of advisor |
8400 |
1 |
-4422.288 |
8856.576 |
8898.791 |
8721.435 |
8394 |
| sr2 |
both ways |
8400 |
1 |
-4425.886 |
8863.773 |
8905.989 |
8719.092 |
8394 |
| sr1 |
both ways |
8400 |
1 |
-4430.221 |
8872.442 |
8914.658 |
8728.545 |
8394 |
| triad |
both ways |
8400 |
1 |
-4431.553 |
8875.106 |
8917.322 |
8740.564 |
8394 |
| mem |
advisee observes advisor of advisor |
8400 |
1 |
-4444.995 |
8899.991 |
8935.171 |
8763.004 |
8395 |
| mem |
both ways |
8400 |
1 |
-4452.356 |
8914.713 |
8949.893 |
8775.496 |
8395 |
| comm |
advisee observes advisor of advisor |
8400 |
1 |
-4461.317 |
8934.634 |
8976.850 |
8821.487 |
8394 |
| mem |
advisor observes advisee of advisee |
8400 |
1 |
-4590.630 |
9191.259 |
9226.439 |
9024.731 |
8395 |
| sr7 |
advisor observes advisee of advisee |
8400 |
1 |
-4591.253 |
9194.506 |
9236.722 |
9021.134 |
8394 |
| triad |
advisor observes advisee of advisee |
8400 |
1 |
-4591.808 |
9195.615 |
9237.831 |
9044.740 |
8394 |
| sr6 |
advisor observes advisee of advisee |
8400 |
1 |
-4591.998 |
9195.997 |
9238.213 |
9024.658 |
8394 |
| sr8 |
advisor observes advisee of advisee |
8400 |
1 |
-4593.417 |
9198.834 |
9241.050 |
9024.487 |
8394 |
| sr5 |
advisor observes advisee of advisee |
8400 |
1 |
-4594.073 |
9200.146 |
9242.362 |
9031.037 |
8394 |
| sr4 |
advisor observes advisee of advisee |
8400 |
1 |
-4596.625 |
9205.251 |
9247.467 |
9038.234 |
8394 |
| sr3 |
advisor observes advisee of advisee |
8400 |
1 |
-4599.254 |
9210.508 |
9252.724 |
9045.352 |
8394 |
| sr9 |
advisor observes advisee of advisee |
8400 |
1 |
-4600.464 |
9212.929 |
9255.145 |
9040.442 |
8394 |
| sr2 |
advisor observes advisee of advisee |
8400 |
1 |
-4601.778 |
9215.556 |
9257.772 |
9052.026 |
8394 |
| sr1 |
advisor observes advisee of advisee |
8400 |
1 |
-4604.117 |
9220.234 |
9262.450 |
9058.124 |
8394 |
Let’s take a look at the best-fitting model estimates:
test_sr7_out %>%
tidy(conf.int = TRUE) %>%
kable()
| fixed |
NA |
(Intercept) |
-1.8725110 |
0.1852784 |
-10.106474 |
0 |
-2.235650 |
-1.509372 |
| fixed |
NA |
mem_value |
3.5286019 |
0.4025315 |
8.766027 |
0 |
2.739655 |
4.317549 |
| fixed |
NA |
sr7 |
14.7453961 |
2.3390595 |
6.303985 |
0 |
10.160924 |
19.329868 |
| ran_pars |
sub_id |
sd__(Intercept) |
0.8095989 |
NA |
NA |
NA |
NA |
NA |
| ran_pars |
sub_id |
cor__(Intercept).sr7 |
-0.2827640 |
NA |
NA |
NA |
NA |
NA |
| ran_pars |
sub_id |
sd__sr7 |
9.5523177 |
NA |
NA |
NA |
NA |
NA |
vif(test_sr7_out)
## mem_value sr7
## 1.081353 1.081353
SR centrality
Can the best-fitting SR explain an individual’s perceptions of other
network members’ centrality?
sr_centrality <- sr_out %>%
# SR centrality from best variant
filter(from != to) %>%
group_by(sub_id, node_id = to) %>%
summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
# Perceived centrality
left_join(
krackhardt_allocentric %>%
group_by(sub_id, node_id = to) %>%
summarise(perceived_centrality = mean(edge), .groups = "drop"),
by = c("sub_id", "node_id")
)
To answer this, we’ll do a random-effects analysis where we take each
individual’s idiosyncratic mental representation of their network, then
rank-correlate it with the SR’s individual-level predictions.
sr_centrality_corr <- sr_centrality %>%
group_by(sub_id) %>%
nest() %>%
mutate(
spearman = map_dbl(
.x = data,
.f = ~with(
.x, cor(sr_centrality, perceived_centrality, method = "spearman")
)
)
) %>%
ungroup() %>%
select(-data)
sr_centrality_corr %>%
mutate(spearman = atanh(spearman)) %>%
select(spearman) %>%
deframe() %>%
t.test() %>%
tidy() %>%
kable(
caption = str_c(
"t-test of the average Spearman correlation between perceived and ",
"SR-predicted centrality (z-transformed)"
)
)
t-test of the average Spearman correlation between perceived
and SR-predicted centrality (z-transformed)
| 0.472048 |
10.55384 |
0 |
20 |
0.3787478 |
0.5653482 |
One Sample t-test |
two.sided |
sr_centrality_corr %>%
summarise(mean_spearman = mean(spearman)) %>%
kable()
Let’s try to visualize these results in an intuitive format.
plot_sr_centrality_group <- sr_centrality %>%
# Get each subject's rank-ordered perceived and SR-predicted centrality
group_by(sub_id) %>%
mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
ungroup() %>%
# Average across subjects to get group-level results
group_by(node_id) %>%
summarise(across(ends_with("_centrality"), mean)) %>%
# For color-coding two pairs of managers with equivalent degree centrality,
# but different perceived centrality
mutate(
colorcode = case_when(
node_id %in% c(1, 7) ~ "a",
node_id %in% c(18, 21) ~ "b"
)
) %>%
ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
gg +
geom_smooth(method = "lm", se = FALSE, color = "black") +
geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
scale_y_continuous(name = "SR centrality (rank-ordered)") +
ggtitle("Perceived vs Successor Centrality (Group-Level)")
plot_sr_centrality_group
## `geom_smooth()` using formula = 'y ~ x'

plot_sr_centrality_individual <- sr_centrality %>%
# Get each subject's rank-ordered perceived and SR-predicted centrality
group_by(sub_id) %>%
mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
ungroup() %>%
# Formatting for plotting
pivot_longer(
ends_with("_centrality"),
names_to = "metric",
values_to = "centrality"
) %>%
mutate(
node_id = factor(node_id),
sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
sub_id = str_c("Manager ", sub_id)
) %>%
ggplot(aes(x=node_id, y=centrality, fill=metric)) +
facet_wrap(~sub_id, nrow = 3) +
geom_col(position = "identity", color = "black", alpha = 0.5) +
coord_polar() +
scale_fill_viridis_d(
name = "Centrality",
labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
) +
ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
theme_bw() +
theme(
plot.title = element_text(hjust = 0.5, size = 13),
plot.subtitle = element_text(hjust = 0.5, size = 11),
panel.grid.minor = element_blank(),
panel.spacing = unit(0.75, "lines"),
legend.box.spacing = unit(0.5, "lines"),
legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
legend.position = "bottom",
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank()
)
plot_sr_centrality_individual

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "sr_centrality_individual_standard.pdf"),
plot = plot_sr_centrality_individual,
width = 12, height = 8,
device = cairo_pdf, units = "in", dpi = 300
)
}
We’re particularly interested in the manager pairs 1-7 and 18-21. On
an individual-by-individual basis, does the model correctly predict, in
each pair, which manager is represented as being more central? We’ll do
a prevalence test to see whether the model made correct predictions for
significantly greater than 50% of subjects.
sr_centrality %>%
# Find and retain only relevant node-pairs
mutate(
pair_id = case_when(
node_id %in% c(1, 7) ~ "1-7",
node_id %in% c(18, 21) ~ "18-21"
)
) %>%
drop_na() %>%
# Whichever manager is represented as being more central in each node-pair,
# does the model make a correct prediction?
group_by(sub_id, pair_id) %>%
mutate(
across(
c(sr_centrality, perceived_centrality),
~.x - first(.x)
)
) %>%
filter(node_id %in% c(7, 21)) %>%
summarise(
correct = sr_centrality * perceived_centrality > 0,
.groups = "drop"
) %>%
# Tally correct predictions
count(pair_id, correct) %>%
# Prevalence test
filter(correct) %>%
split(.$pair_id) %>%
map_dfr(
.f = ~.x %>%
select(n) %>%
deframe() %>%
binom.test(x = ., n = 21, p = 0.5, alternative = "greater") %>%
tidy(),
.id = "pair_id"
) %>%
kable(caption = "Prevalence test (> 50%)")
Prevalence test (> 50%)
| 1-7 |
0.7142857 |
15 |
0.0391769 |
21 |
0.5126112 |
1 |
Exact binomial test |
greater |
| 18-21 |
0.8571429 |
18 |
0.0007448 |
21 |
0.6707890 |
1 |
Exact binomial test |
greater |
We’d previously noted that betweenness seems to track discrepancies
between a manager’s true versus perceived centrality. Does SR centrality
track with betweenness? The loess curve suggests the relationship is
nonlinear, but there is a strongly significant rank correlation between
the two.
sr_centrality %>%
group_by(name = node_id) %>%
summarise(sr_centrality = mean(sr_centrality)) %>%
left_join(centrality_differences) %>%
ggplot(aes(x=sr_centrality, y=Betweenness)) +
gg +
geom_smooth(method = "loess", se = FALSE) +
geom_label(aes(label = name)) +
xlab("Successor Representation \u03B3 = 0.7")
## Joining with `by = join_by(name)`
## `geom_smooth()` using formula = 'y ~ x'

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "centrality_sr_vs_betweenness.pdf"),
width = 6, height = 4,
device = cairo_pdf, units = "in", dpi = 300
)
}
## `geom_smooth()` using formula = 'y ~ x'
sr_centrality %>%
group_by(name = node_id) %>%
summarise(sr_centrality = mean(sr_centrality)) %>%
left_join(centrality_differences) %>%
with(cor.test(Betweenness, sr_centrality, method = "spearman", exact = FALSE)) %>%
tidy() %>%
kable(caption = "Betweenness vs SR gamma 0.7")
## Joining with `by = join_by(name)`
Betweenness vs SR gamma 0.7
| 0.8948052 |
162 |
0 |
Spearman’s rank correlation rho |
two.sided |
Company leadership
So far, we have not provided the model with any privileged knowledge
about the company’s senior leadership hierarchy. Yet, using observations
constrained by the network structure, the model has identified most of
the company’s leadership anyways.
The CEO is coded as node 7, and the four Vice Presidents are coded as
2, 14, 18, and 21. We see clear evidence that the model identifies nodes
2, 7, 18, and 21 as being central figures, but the model makes
conspicuously incorrect predictions about manager 14.
Could the model in principle do a better job at identifying manager
14 as a central network member? To test this, we’ll make one additional
assumption about managers’ observations: they observe the senior
leadership twice as much as other network members.
sr_observations_leadership <- obs_edgelist_out %>%
# Double-count CEO and VPs
bind_rows(
obs_edgelist_out %>%
filter(from %in% c(2, 7, 14, 18, 21) | to %in% c(2, 7, 14, 18, 21))
) %>%
expand_grid(rep = 1:100) %>%
group_by(sub_id, rep) %>%
slice_sample(prop = 1) %>%
ungroup()
sr_leadership <- sr_observations_leadership %>%
expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
group_by(sub_id, gamma) %>%
nest() %>%
mutate(
sr = map2(
.x = data, .y = gamma,
~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
)
) %>%
unnest(sr) %>%
ungroup() %>%
select(-data) %>%
mutate(gamma = gamma * 10) %>%
pivot_wider(
names_from = gamma,
names_prefix = "sr",
values_from = sr_value
)
We’ll test whether the model goodness-of-fit is better at \(\gamma = 0.7\), which was the best-fitting
model from before:
test_sr7_leadership <- krackhardt_allocentric %>%
rename(choice = edge) %>%
left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
left_join(sr_leadership, by = c("sub_id", "from", "to")) %>%
glmer(
choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
family = "binomial", data = .
)
bind_rows(
glance(test_sr7_out) %>% mutate(model = "sr7 standard"),
glance(test_sr7_leadership) %>% mutate(model = "sr7 leadership 2x")
) %>%
kable()
| 8400 |
1 |
-4363.029 |
8738.058 |
8780.274 |
8565.413 |
8394 |
sr7 standard |
| 8400 |
1 |
-4301.586 |
8615.172 |
8657.388 |
8443.772 |
8394 |
sr7 leadership 2x |
sr_centrality_leadership <- sr_leadership %>%
# SR centrality from best variant
filter(from != to) %>%
group_by(sub_id, node_id = to) %>%
summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
# Perceived centrality
left_join(
krackhardt_allocentric %>%
group_by(sub_id, node_id = to) %>%
summarise(perceived_centrality = mean(edge), .groups = "drop"),
by = c("sub_id", "node_id")
)
sr_centrality_corr_leadership <- sr_centrality_leadership %>%
group_by(sub_id) %>%
nest() %>%
mutate(
spearman = map_dbl(
.x = data,
.f = ~with(
.x, cor(
sr_centrality, perceived_centrality, method = "spearman"
)
)
)
) %>%
ungroup() %>%
select(-data)
sr_centrality_corr_leadership %>%
mutate(spearman = atanh(spearman)) %>%
select(spearman) %>%
deframe() %>%
t.test() %>%
tidy() %>%
kable(
caption = str_c(
"t-test of the average Spearman correlation between perceived and ",
"SR-predicted centrality (z-transformed)"
)
)
t-test of the average Spearman correlation between perceived
and SR-predicted centrality (z-transformed)
| 0.5811331 |
11.88074 |
0 |
20 |
0.4791005 |
0.6831656 |
One Sample t-test |
two.sided |
sr_centrality_corr_leadership %>%
summarise(mean_spearman = mean(spearman)) %>%
kable()
plot_sr_centrality_group_leadership <- sr_centrality_leadership %>%
# Get each subject's rank-ordered perceived and SR-predicted centrality
group_by(sub_id) %>%
mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
ungroup() %>%
# Average across subjects to get group-level results
group_by(node_id) %>%
summarise(across(ends_with("_centrality"), mean)) %>%
# For color-coding two pairs of managers with equivalent degree centrality,
# but different perceived centrality
mutate(
colorcode = case_when(
node_id %in% c(1, 7) ~ "a",
node_id %in% c(18, 21) ~ "b"
)
) %>%
ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
gg +
geom_smooth(method = "lm", se = FALSE, color = "black") +
geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
scale_y_continuous(name = "SR centrality (rank-ordered)") +
ggtitle("Perceived vs Successor Centrality (Group-Level)")
plot_sr_centrality_group_leadership
## `geom_smooth()` using formula = 'y ~ x'

plot_sr_centrality_individual_leadership <- sr_centrality_leadership %>%
# Get each subject's rank-ordered perceived and SR-predicted centrality
group_by(sub_id) %>%
mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
ungroup() %>%
# Formatting for plotting
pivot_longer(
ends_with("_centrality"),
names_to = "metric",
values_to = "centrality"
) %>%
mutate(
node_id = factor(node_id),
sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
sub_id = str_c("Manager ", sub_id)
) %>%
ggplot(aes(x=node_id, y=centrality, fill=metric)) +
facet_wrap(~sub_id, nrow = 3) +
geom_col(position = "identity", color = "black", alpha = 0.5) +
coord_polar() +
scale_fill_viridis_d(
name = "Centrality",
labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
) +
ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
theme_bw() +
theme(
plot.title = element_text(hjust = 0.5, size = 13),
plot.subtitle = element_text(hjust = 0.5, size = 11),
panel.grid.minor = element_blank(),
panel.spacing = unit(0.75, "lines"),
legend.box.spacing = unit(0.5, "lines"),
legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
legend.position = "bottom",
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank()
)
plot_sr_centrality_individual_leadership

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "sr_centrality_group_leadership.pdf"),
plot = plot_sr_centrality_group_leadership,
width = 6, height = 4,
device = cairo_pdf, units = "in", dpi = 300
)
ggsave(
here("outputs", "07-krackhardt", "sr_centrality_individual_leadership.pdf"),
plot = plot_sr_centrality_individual_leadership,
width = 12, height = 8,
device = cairo_pdf, units = "in", dpi = 300
)
}
## `geom_smooth()` using formula = 'y ~ x'
Figures for paper
network_for_paper <- krackhardt_g %>%
mutate(
Degree = centrality_degree(mode = "in"),
node_colorcode = case_when(
name %in% c(1, 7) ~ "a",
name %in% c(18, 21) ~ "b"
)
) %>%
# To clean up some of the overplotting
activate("edges") %>%
mutate(
edge_colorcode = case_when(
from %in% c(1, 7) | to %in% c(1, 7) ~ "a",
from %in% c(18, 21) | to %in% c(18, 21) ~ "b"
)
) %>%
# Plot the "ground truth" network
ggraph(layout = "stress") +
geom_edge_link(
aes(
start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
color = edge_colorcode
),
arrow = arrow(length = unit(5, "pt"), type = "closed"),
show.legend = FALSE
) +
geom_node_label(
aes(label = name, size = Degree, fill = node_colorcode),
show.legend = FALSE
) +
scale_size(range = c(3, 10)) +
scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
scale_edge_color_manual(
values = c("#1f78b4", "#33a02c"), na.value = "grey80"
) +
ggtitle("Advice Network (Krackhardt 1987)") +
theme(
panel.background = element_blank(),
strip.background = element_blank(),
panel.border = element_blank(),
axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5, size = 13),
strip.text = element_text(hjust = 0.5, size = 13),
legend.position = "bottom"
)
network_for_paper

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "network_graph.pdf"),
plot = network_for_paper,
width = 12, height = 5,
device = cairo_pdf, units = "in", dpi = 300
)
}
representation_for_paper <- (
(plot_choice + ggtitle("Perceived (Empirical)")) |
plot_sr7_out + ggtitle("Successor Rep. \u03B3 = 0.7 (Simulated)")
) +
plot_annotation(
title = "Network Representation",
theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
)
representation_for_paper

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "representation.pdf"),
plot = representation_for_paper,
width = 12, height = 5,
device = cairo_pdf, units = "in", dpi = 300
)
}
centrality_individual_for_paper <- sr_centrality %>%
filter(sub_id %in% c(17, 19)) %>%
# Get each subject's rank-ordered perceived and SR-predicted centrality
group_by(sub_id) %>%
mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
ungroup() %>%
# Formatting for plotting
pivot_longer(
ends_with("_centrality"),
names_to = "metric",
values_to = "centrality"
) %>%
mutate(
node_id = factor(node_id),
sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
sub_id = str_c("Manager ", sub_id)
) %>%
ggplot(aes(x=node_id, y=centrality, fill=metric)) +
facet_wrap(~sub_id) +
geom_col(position = "identity", color = "black", alpha = 0.5) +
coord_polar() +
scale_fill_viridis_d(
name = "Centrality",
labels = c("Perceived (Empirical)", "Successor Rep. (Simulated)")
) +
ggtitle("Individual-Level") +
theme_bw() +
theme(
plot.title = element_text(hjust = 0.5, size = 13),
plot.subtitle = element_text(hjust = 0.5, size = 11),
panel.grid.minor = element_blank(),
panel.spacing = unit(0.75, "lines"),
legend.box.spacing = unit(0.5, "lines"),
legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
legend.position = "bottom",
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank()
)
centrality_for_paper <- (
(
plot_sr_centrality_group +
ggtitle("Group-Level") +
scale_y_continuous(name = "Successor Rep. centrality (rank-ordered)")
) |
centrality_individual_for_paper
) +
plot_annotation(
title = "Perceived vs Successor Centrality",
theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
)
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
centrality_for_paper
## `geom_smooth()` using formula = 'y ~ x'

if (knitting) {
ggsave(
here("outputs", "07-krackhardt", "centrality.pdf"),
plot = centrality_for_paper,
width = 12, height = 5,
device = cairo_pdf, units = "in", dpi = 300
)
}
## `geom_smooth()` using formula = 'y ~ x'
Save data
It was a bit of a pain getting the raw data into shape, so we’ll save
the cleaned versions for posterity.
if (knitting) {
krackhardt_egocentric %>%
write_csv(here("data", "krackhardt", "clean_egocentric.csv"))
krackhardt_allocentric %>%
rename(choice = edge) %>%
write_csv(here("data", "krackhardt", "clean_allocentric.csv"))
}
Session info
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.6.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] broom.mixed_0.2.9.4 lmerTest_3.1-3 lme4_1.1-34
## [4] Matrix_1.6-1 patchwork_1.1.3 ggraph_2.1.0
## [7] tidygraph_1.2.3 here_1.0.1 lubridate_1.9.2
## [10] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2
## [13] purrr_1.0.2 readr_2.1.4 tidyr_1.3.0
## [16] tibble_3.2.1 ggplot2_3.4.3 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.0 viridisLite_0.4.2 farver_2.1.1
## [4] viridis_0.6.4 fastmap_1.1.1 tweenr_2.0.2
## [7] digest_0.6.33 timechange_0.2.0 lifecycle_1.0.3
## [10] magrittr_2.0.3 compiler_4.3.1 rlang_1.1.1
## [13] sass_0.4.7 tools_4.3.1 igraph_1.5.1
## [16] utf8_1.2.3 yaml_2.3.7 knitr_1.43
## [19] labeling_0.4.2 graphlayouts_1.0.0 bit_4.0.5
## [22] abind_1.4-5 withr_2.5.0 numDeriv_2016.8-1.1
## [25] grid_4.3.1 polyclip_1.10-4 fansi_1.0.4
## [28] colorspace_2.1-0 future_1.33.0 globals_0.16.2
## [31] scales_1.2.1 MASS_7.3-60 cli_3.6.1
## [34] crayon_1.5.2 rmarkdown_2.24 generics_0.1.3
## [37] tzdb_0.4.0 minqa_1.2.5 cachem_1.0.8
## [40] ggforce_0.4.1 splines_4.3.1 parallel_4.3.1
## [43] vctrs_0.6.3 boot_1.3-28.1 jsonlite_1.8.7
## [46] carData_3.0-5 car_3.1-2 hms_1.1.3
## [49] bit64_4.0.5 ggrepel_0.9.3 listenv_0.9.0
## [52] jquerylib_0.1.4 glue_1.6.2 parallelly_1.36.0
## [55] nloptr_2.0.3 codetools_0.2-19 stringi_1.7.12
## [58] gtable_0.3.3 munsell_0.5.0 pillar_1.9.0
## [61] furrr_0.3.1 htmltools_0.5.6 R6_2.5.1
## [64] rprojroot_2.0.3 vroom_1.6.3 evaluate_0.21
## [67] lattice_0.21-8 highr_0.10 backports_1.4.1
## [70] broom_1.0.5 bslib_0.5.1 Rcpp_1.0.11
## [73] gridExtra_2.3 nlme_3.1-162 mgcv_1.8-42
## [76] xfun_0.40 fs_1.6.3 pkgconfig_2.0.3
---
title: "Reanalyze Krackhardt 1987"
output:
  html_document:
    code_download: true
    code_folding: hide
    toc: true
    toc_float:
      collapsed: true
---

In his 1987 paper (Cognitive Social Structures), Krackhardt includes data from a 21-member network of managers (all male) at a tech manufacturing company. These data include not only egocentric reports (these are the people *I* go to for advice), but also allocentric reports (I think John often asks Bill for advice). We can try testing whether multistep relational abstraction explains the subjects' allocentric beliefs in this dataset.

# Setup

```{r setup}
# Set a random seed for reproducibility
set.seed(sum(utf8ToInt("krackhardt")))

library(tidyverse)
library(here)
library(tidygraph)
library(ggraph)
library(patchwork)
library(lme4)
library(lmerTest)
library(broom.mixed)

kable <- knitr::kable
vif <- car::vif

knitting <- knitr::is_html_output()

if (knitting) {
  # Create directories for saving stuff
  if (!dir.exists(here("outputs"))) {
    dir.create(here("outputs"))
  }
  
  if (!dir.exists(here("outputs", "07-krackhardt"))) {
    dir.create(here("outputs", "07-krackhardt"))
  }
}

# Pull in some modeling tools
source(here("code", "utils", "representation_utils.R"))
source(here("code", "utils", "network_utils.R"))

gg <- list(
  theme_bw(),
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines")
  )
)

plot_representation <- function(rep_df, relation_value, n_breaks = 3) {
  plot_df <- rep_df %>%
    filter(from != to) %>%
    group_by(from, to) %>%
    summarise(value = mean({{relation_value}}), .groups = "drop")
  
  plot_df %>%
    # Plot
    mutate(
      across(c(from, to), factor),
      from = fct_rev(from)
    ) %>%
    ggplot(aes(x=to, y=from, fill=value)) +
    gg +
    geom_tile() +
    coord_fixed() +
    scale_fill_viridis_c(
      option = "magma", name = NULL, na.value = "white", n.breaks = n_breaks
    ) +
    theme(
      legend.position = "bottom",
      axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)
    )
}

check_significance <- function(tidy_stats) {
  tidy_stats %>%
    mutate(
      significance = case_when(
        p.value < 0.001 ~ "***",
        p.value < 0.01 ~ "**",
        p.value < 0.05 ~ "*",
        p.value < 0.1 ~ ".",
        TRUE ~ ""
      )
    )
}
```

These data are included in Krackhardt 1987, Appendix A. Each allocentric CSV contains an adjacency matrix, so we'll need to do a little work to get them into a tidy-friendly format.

The "egocentric" network is defined in the way that social networks are most often defined: based on each subject's answer to the question, "Who are you connected to?" Oftentimes, this question is phrased to measure friendships (e.g., "Are you friends with X?"). In this particular dataset, the question is something closer to "Who do you ask for advice?"

The "allocentric" cognitive maps measure each subject's answer to the question, "Does X ask Y for advice?" This includes the special case of, "Does X ask you for advice?" Here's the rationale for including the special case in allocentric maps. Imagine for a moment that this were a study about friendships instead of advice-giving. In that case, we would consider "Are you friends with X?" to be a question that subjects would know the answer to (or at least, they would know better than anyone else). On the other hand, we would consider the question "Does X consider you a friend?" to be more of an inference than knowledge. I don't know whether X considers me a friend; I don't consider X to be a friend, but we do occasionally eat lunch together. Moreover, in cases of disagreement (X says he asks Y for advice; Y doesn't agree), the "ground truth" egocentric network would then hinge on the researcher's decision to use an intersection vs union rule to resolve disagreements; Krackhardt notes this in his paper. By alternatively defining the egocentric network based solely on the responses that subjects would have greatest introspective access to (their own advice-seeking behavior), we sidestep this researcher degree of freedom.

```{r clean-data}
krackhardt_raw <- here("data", "krackhardt") %>%
  fs::dir_ls(regexp = "krackhardt_tech_managers_sub[[:digit:]]{2}\\.csv") %>%
  map_dfr(
    .f = ~read_csv(.x, col_names = FALSE, show_col_types = FALSE),
    .id = "filename"
  ) %>%
  mutate(
    sub_id = str_extract(filename, "sub[[:digit:]]{2}"),
    sub_id = str_remove(sub_id, "sub"),
    sub_id = as.numeric(sub_id)
  ) %>%
  select(-filename) %>%
  group_by(sub_id) %>%
  mutate(from = row_number()) %>%
  ungroup() %>%
  pivot_longer(
    cols = -c(sub_id, from),
    names_to = "to",
    values_to = "edge"
  ) %>%
  mutate(
    to = str_remove(to, "X"),
    to = as.integer(to)
  )

krackhardt_egocentric <- krackhardt_raw %>%
  filter(sub_id == from, from != to) %>%
  select(-sub_id)

krackhardt_allocentric <- krackhardt_raw %>%
  filter(sub_id != from, from != to)
```


# Plot data

Okay, let's try and get a sense for what the "egocentric" network looks like.

```{r network-graph}
krackhardt_g <- krackhardt_egocentric %>%
  filter(edge == 1) %>%
  select(-edge) %>% 
  tbl_graph(edges = .) %>%
  mutate(name = row_number())

plot_network_graph <- krackhardt_g %>%
  mutate(Degree = centrality_degree(mode = "in")) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed")
  ) +
  geom_node_label(
    aes(label = name, size = Degree),
    fill = "grey95", show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

plot_network_graph
```

We can also plot this as a matrix instead of a graph, which will ease comparison with the cognitive strategies we'll consider next.

```{r network-adjacency}
plot_network_adj <- krackhardt_egocentric %>%
  plot_representation(edge, n_breaks = 2) +
  ggtitle("Egocentric Network")

plot_network_adj
```

How does this compare to subjects' mental representation of the network?

```{r plot-representation}
plot_choice <- krackhardt_allocentric %>%
  plot_representation(relation_value = edge) +
  ggtitle("Network Representation")

plot_network_adj | plot_choice
```

In the original paper, Krackhardt makes the following note: "There are some centers of focus for advice, notably 2 and 21 (both are vice presidents) ... In [the subjectively perceived network], 21 loses his prominence as a major recipient of nominations: instead, 18, 14, and 7 (the president) appear to be more central." To get a quantitative sense for this in our data, we can try calculating the "perceived" degree centrality by summing over the matrix columns, then comparing that against the "true" in-degree centrality.

```{r calc-centrality-diff}
centrality_differences <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    Betweenness = centrality_betweenness(directed = TRUE)
  ) %>%
  as_tibble() %>%
  # Add represented in-degree; standardize sum by number of network members
  left_join(
    krackhardt_allocentric %>%
      group_by(name = to) %>%
      summarise(Perceived = sum(edge)/21, .groups = "drop"),
    by = c("name")
  ) %>%
  select(name, Perceived, everything()) %>%
  arrange(desc(Degree), desc(Perceived))
```

We can see that the same number of managers go to subjects 18 and 21 for advice (15 each), but that subject 18 is perceived as being much more "popular" (averaged across subjects, 11.95 managers are perceived to go to him for advice) than subject 21 (averaged across subjects, 8.38 managers are perceived to go to him for advice). A similar phenomenon happens with subjects 1 and 7, where subject 1 is perceived as being much less popular than subject 7, despite their having the same "true" popularity.

```{r centrality-diff-degree}
centrality_differences %>%
  ggplot(aes(x=Degree, y=Perceived)) +
  gg +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True (degree) centrality") +
  scale_y_continuous(name = "Perceived centrality")

centrality_differences %>%
  select(name, Degree, Perceived) %>%
  kable()
```

Is it possible that subjects' network representations might be reflecting some other centrality metric? In the original paper, Krackhardt notes that betweenness seems to have some predictive power. For the 18-21 and 1-7 test cases, we see that betweenness centrality does make the "correct" prediction about which network member is represented as being more important. But, we should note that betweenness doesn't do the greatest job predicting perceived popularity across the board; its greatest predictive power is in explaining the popularity discrepancy for the most popular managers.

```{r centrality-differences-between}
centrality_differences %>%
  pivot_longer(cols = Degree:Betweenness, names_to = "metric") %>%
  ggplot(aes(x=value, y=Perceived)) +
  gg +
  facet_wrap(~metric, scales = "free_x") +
  geom_smooth(method = "lm", se = FALSE) +
  geom_label(aes(label = name)) +
  scale_x_continuous(name = "True centrality") +
  scale_y_continuous(name = "Perceived centrality")

centrality_differences %>%
  kable()
```


# Simulated observations

We know that an individual's embedding within a network dictates what interactions and/or relationships they're able to observe. Therefore, when building predicted cognitive maps, we should use a different set of interactions to inform each subject's predicted representations. Of course, we have no idea what individual subjects' observations actually are. So we'll need to come up with a compact and reasonable set of assumptions about what observations were most likely available to individual network members.

Here are three key assumptions we'll make here:

1. We'll assume (trivially) that people can observe interactions between themselves and their immediate advisors (and/or advisees).
2. We'll also assume that people are generally able to observe interactions between advisors-of-advisors (and/or advisees-of-advisees), either through firsthand observation or secondhand chatter.
3. Though of course it's plausible that people might observe more distant interactions, we'll assume this happens with enough infrequency that we can ignore those interactions for the sake of modeling.

We've left open an ambiguity in those assumptions: are people observing advisors-of-advisors, and/or advisees-of-advisees? To aid understanding, we'll use the following example. Suppose we're looking at a subset of a network consisting of the grad student Abby, who asks the postdoc Betty for advice; in turn, Betty the postdoc asks Cathy the assistant professor for advice; and Cathy asks Dani the associate professor for advice. There are three reasonable ways that observations might work in this network:

1. Observations go "out": When Abby asks Betty for advice, Betty responds by sharing the wisdom she's gained from asking Cathy for advice. That leads Abby to observe that Betty goes to Cathy for advice.
2. Observations go "in": Betty has asked Cathy for advice. Cathy thinks Betty has asked a great question that she doesn't know the answer to, so Cathy asks Dani about it. That leads Dani to observe that Betty goes to Cathy for advice.
3. It goes both ways.

We may be able to test this empirically. We would need to simulate a set of observations consistent with all three possibilities, generate model predictions from those simulations, then compare how well those models fit managers' actual perceptions of the network.

```{r def-observations}
# Observations go both ways
obs_edgelist_all <- expand_grid(
  sub_id = 1:21,
  # Start with all one-step egocentric observations
  krackhardt_egocentric
) %>%
  # Convert adjlist into edgelist
  filter(edge == 1) %>%
  select(-edge) %>%
  # For a given node, we want to know who their "advisors-of-advisors" are
  # (or equivalently, "advisees-of-advisees")
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "all")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  # We assume that nodes are able to observe (directly or through chatter)
  # advice-giving interactions from their advisors and advisors-of-advisors
  # (or equivalently, advisees and advisees-of-advisees)
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  # Make pretty
  arrange(sub_id, from)

# Advisees observe who advises their advisors
obs_edgelist_out <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "out")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

# Advisors observe who their advisees advise
obs_edgelist_in <- expand_grid(
  sub_id = 1:21,
  krackhardt_egocentric
) %>%
  filter(edge == 1) %>%
  select(-edge) %>%
  left_join(
    krackhardt_g %>%
      mutate(neighbors = local_members(order = 1, mode = "in")) %>%
      as_tibble(),
    by = c("sub_id" = "name")
  ) %>%
  rowwise() %>%
  filter((from %in% neighbors) | (to %in% neighbors)) %>%
  ungroup() %>%
  select(-neighbors) %>%
  arrange(sub_id, from)

obs_adjlist_all <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_all %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_out <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_out %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))

obs_adjlist_in <- expand_grid(sub_id = 1:21, from = sub_id, to = sub_id) %>%
  left_join(
    obs_edgelist_in %>% mutate(edge = 1),
    by = c("sub_id", "from", "to")
  ) %>%
  mutate(edge = if_else(is.na(edge), 0, edge))
```


# Schema-based representation

If people were using triad or community schemas/heuristics to inform representation, what would this look like? Using literally the exact same tools that we used to analyze the butterfly network, let's create some predicted representations.

```{r schema-experts}
schema_experts_all <- obs_adjlist_all %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_all %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_out <- obs_adjlist_out %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_out %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )

schema_experts_in <- obs_adjlist_in %>%
  # Memorization
  group_by(sub_id, from) %>%
  mutate(
    mem_value = edge / sum(edge),
    mem_value = if_else(is.nan(mem_value), 0, mem_value)
  ) %>%
  ungroup() %>%
  # Now triads
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(triad = map(data, ~build_rep_triad(.x, "edge"))) %>%
      unnest(triad) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)) %>%
      mutate(triad_value = if_else(is.nan(triad_value), 0, triad_value)),
    by = c("sub_id", "from", "to")
  ) %>%
  # Finally communities
  left_join(
    obs_adjlist_in %>%
      group_by(sub_id) %>%
      nest() %>%
      mutate(comm = map(data, ~build_rep_community(.x, "edge"))) %>%
      unnest(comm) %>%
      ungroup() %>%
      select(-c(data, transition_scaling)),
    by = c("sub_id", "from", "to")
  )
```

```{r plot-schema-all}
plot_mem_all <- plot_representation(schema_experts_all, mem_value) +
  ggtitle("Memorization")

plot_triad_all <- plot_representation(schema_experts_all, triad_value) +
  ggtitle("Triad Completion")

plot_comm_all <- plot_representation(schema_experts_all, comm_value) +
  ggtitle("Community Detection")

plot_schemas_all <- (
  plot_choice | plot_mem_all | plot_triad_all | plot_comm_all
) +
  plot_annotation(
    title = "Observations of advice-giving go both ways",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_all.pdf"),
    plot_schemas_all,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r plot-schema-out}
plot_mem_out <- plot_representation(schema_experts_out, mem_value) +
  ggtitle("Memorization")

plot_triad_out <- plot_representation(schema_experts_out, triad_value) +
  ggtitle("Triad Completion")

plot_comm_out <- plot_representation(schema_experts_out, comm_value) +
  ggtitle("Community Detection")

plot_schemas_out <- (
  plot_choice | plot_mem_out | plot_triad_out | plot_comm_out
) +
  plot_annotation(
    title = "Advisees observe advisors' advisors",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_out.pdf"),
    plot_schemas_out,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r plot-schema-in}
plot_mem_in <- plot_representation(schema_experts_in, mem_value) +
  ggtitle("Memorization")

plot_triad_in <- plot_representation(schema_experts_in, triad_value) +
  ggtitle("Triad Completion")

plot_comm_in <- plot_representation(schema_experts_in, comm_value) +
  ggtitle("Community Detection")

plot_schemas_in <- (
  plot_choice | plot_mem_in | plot_triad_in | plot_comm_in
) +
  plot_annotation(
    title = "Advisors observe advisees' advisees",
    theme = theme(plot.title = element_text(size = 15, hjust = 0.5))
  )

plot_schemas_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "schema_gallery_in.pdf"),
    plot_schemas_in,
    width = 12, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

At the group-level... schemas make frankly terrible predictions, no matter what we assume about observations. Note that if a particular subject is predicted to ask all other managers for advice, the row associated with that subject will be filled with a value close to 1/20=0.05. So we can see that the triad completion strategy predicts that most managers are connected, and the community detection strategy predicts that literally all managers are connected. Neither comes even a little bit close to the actual allocentric network representation.


# Successor repesentation

Let's say people use multistep relational abstraction to represent networks, and let's use the SR to approximate that kind of abstraction. Using the observations we generated before, let's create lots of learning "trials" to train the algorithm.

```{r create-sr-obs}
sr_observations_all <- obs_edgelist_all %>%
  # Create simulated "learning trials" to train an asymptotic SR
  expand_grid(rep = 1:100) %>%
  # Randomize "order of trials" per repetition
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_out <- obs_edgelist_out %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_observations_in <- obs_edgelist_in %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()
```

And now we can use those to build SRs at different scales. We'll assume that every subject is using the same successor horizon...

```{r compute-sr}
sr_all <- sr_observations_all %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_out <- sr_observations_out %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )

sr_in <- sr_observations_in %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )
```

Let's get a sense for the predictions when the SR is trained on observations that go both ways.

```{r plot-sr-all}
plot_sr1_all <- sr_all %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_all <- sr_all %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_all <- sr_all %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_all <- sr_all %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_all <- sr_all %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_all <- (
  plot_choice +
    plot_sr1_all + plot_sr3_all + plot_sr5_all + plot_sr7_all + plot_sr9_all
)

plot_sr_gallery_all

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_all.pdf"),
    plot_sr_gallery_all,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

And now where advisees observe their advisors' advisors...

```{r plot-sr-out}
plot_sr1_out <- sr_out %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_out <- sr_out %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_out <- sr_out %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_out <- sr_out %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_out <- sr_out %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_out <- (
  plot_choice +
    plot_sr1_out + plot_sr3_out + plot_sr5_out + plot_sr7_out + plot_sr9_out
)

plot_sr_gallery_out

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_out.pdf"),
    plot_sr_gallery_out,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

And finally where advisors observe their advisees' advisees...

```{r plot-sr-in}
plot_sr1_in <- sr_in %>%
  plot_representation(relation_value = sr1, n_breaks = 4) +
  ggtitle("\u03B3 = 0.1")

plot_sr3_in <- sr_in %>%
  plot_representation(relation_value = sr3) +
  ggtitle("\u03B3 = 0.3")

plot_sr5_in <- sr_in %>%
  plot_representation(relation_value = sr5) +
  ggtitle("\u03B3 = 0.5")

plot_sr7_in <- sr_in %>%
  plot_representation(relation_value = sr7) +
  ggtitle("\u03B3 = 0.7")

plot_sr9_in <- sr_in %>%
  plot_representation(relation_value = sr9) +
  ggtitle("\u03B3 = 0.9")

plot_sr_gallery_in <- (
  plot_choice +
    plot_sr1_in + plot_sr3_in + plot_sr5_in + plot_sr7_in + plot_sr9_in
)

plot_sr_gallery_in

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_gallery_in.pdf"),
    plot_sr_gallery_in,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```


# Statistical tests

Okay, now that we have all of the predicted strategies in hand, we'll want to perform some formal statistical tests to verify that the SR is truly doing the best at explaining subjects' network representations. Let's create a dataframe with all of the info we'll need.

```{r define-test-dataset}
test_dataset_all <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_all, by = c("sub_id", "from", "to")) %>%
  left_join(sr_all, by = c("sub_id", "from", "to"))

test_dataset_out <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_out, by = c("sub_id", "from", "to"))

test_dataset_in <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_in, by = c("sub_id", "from", "to")) %>%
  left_join(sr_in, by = c("sub_id", "from", "to"))
```

We'll do all of the statistical tests for the "observations go both ways" models. Note that we can't actually estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

```{r test-obs-all}
test_mem_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_all <- test_dataset_all %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_all <- test_dataset_all %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Moving on to the "advisees observe advisors' advisors" models.

```{r test-obs-out}
test_mem_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

test_comm_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
    family = "binomial", data = .
  )

test_sr1_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_out <- test_dataset_out %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Finally, the "advisors observe advisees' advisees" models. Once again, we can't estimate a model for community detection, as this strategy predicts that everyone is connected to everyone.

```{r test-obs-in}
test_mem_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + (1 + mem_value | sub_id),
    family = "binomial", data = .
  )

test_triad_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + triad_value + (1 + triad_value | sub_id),
    family = "binomial", data = .
  )

# test_comm_in <- test_dataset_in %>%
#   glmer(
#     choice ~ mem_value + comm_value + (1 + comm_value | sub_id),
#     family = "binomial", data = .
#   )

test_sr1_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr1 + (1 + sr1 | sub_id),
    family = "binomial", data = .
  )

test_sr2_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr2 + (1 + sr2 | sub_id),
    family = "binomial", data = .
  )

test_sr3_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr3 + (1 + sr3 | sub_id),
    family = "binomial", data = .
  )

test_sr4_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr4 + (1 + sr4 | sub_id),
    family = "binomial", data = .
  )

test_sr5_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr5 + (1 + sr5 | sub_id),
    family = "binomial", data = .
  )

test_sr6_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr6 + (1 + sr6 | sub_id),
    family = "binomial", data = .
  )

test_sr7_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

test_sr8_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr8 + (1 + sr8 | sub_id),
    family = "binomial", data = .
  )

test_sr9_in <- test_dataset_in %>%
  glmer(
    choice ~ mem_value + sr9 + (1 + sr9 | sub_id),
    family = "binomial", data = .
  )
```

Of all of these models, which has the "best" fit, based on BIC?

```{r goodness-of-fit}
# Which has the best "goodness-of-fit"?
goodness_of_fit <- bind_rows(
  # All: goes both ways
  glance(test_mem_all) %>% mutate(model = "mem_all"),
  glance(test_triad_all) %>% mutate(model = "triad_all"),
  glance(test_sr1_all) %>% mutate(model = "sr1_all"),
  glance(test_sr2_all) %>% mutate(model = "sr2_all"),
  glance(test_sr3_all) %>% mutate(model = "sr3_all"),
  glance(test_sr4_all) %>% mutate(model = "sr4_all"),
  glance(test_sr5_all) %>% mutate(model = "sr5_all"),
  glance(test_sr6_all) %>% mutate(model = "sr6_all"),
  glance(test_sr7_all) %>% mutate(model = "sr7_all"),
  glance(test_sr8_all) %>% mutate(model = "sr8_all"),
  glance(test_sr9_all) %>% mutate(model = "sr9_all"),
  # Out: advisors of advisors
  glance(test_mem_out) %>% mutate(model = "mem_out"),
  glance(test_triad_out) %>% mutate(model = "triad_out"),
  glance(test_comm_out) %>% mutate(model = "comm_out"),
  glance(test_sr1_out) %>% mutate(model = "sr1_out"),
  glance(test_sr2_out) %>% mutate(model = "sr2_out"),
  glance(test_sr3_out) %>% mutate(model = "sr3_out"),
  glance(test_sr4_out) %>% mutate(model = "sr4_out"),
  glance(test_sr5_out) %>% mutate(model = "sr5_out"),
  glance(test_sr6_out) %>% mutate(model = "sr6_out"),
  glance(test_sr7_out) %>% mutate(model = "sr7_out"),
  glance(test_sr8_out) %>% mutate(model = "sr8_out"),
  glance(test_sr9_out) %>% mutate(model = "sr9_out"),
  # In: advisees of advisees
  glance(test_mem_in) %>% mutate(model = "mem_in"),
  glance(test_triad_in) %>% mutate(model = "triad_in"),
  glance(test_sr1_in) %>% mutate(model = "sr1_in"),
  glance(test_sr2_in) %>% mutate(model = "sr2_in"),
  glance(test_sr3_in) %>% mutate(model = "sr3_in"),
  glance(test_sr4_in) %>% mutate(model = "sr4_in"),
  glance(test_sr5_in) %>% mutate(model = "sr5_in"),
  glance(test_sr6_in) %>% mutate(model = "sr6_in"),
  glance(test_sr7_in) %>% mutate(model = "sr7_in"),
  glance(test_sr8_in) %>% mutate(model = "sr8_in"),
  glance(test_sr9_in) %>% mutate(model = "sr9_in"),
) %>%
  select(model, everything()) %>%
  arrange(BIC) %>%
  # Make this a little more human-readable
  separate(model, into = c("model", "direction")) %>%
  mutate(
    direction = case_when(
      direction == "out" ~ "advisee observes advisor of advisor",
      direction == "in" ~ "advisor observes advisee of advisee",
      direction == "all" ~ "both ways"
    )
  )

goodness_of_fit %>%
  mutate(
    model = fct_relevel(
      model,
      "sr1", "sr2", "sr3", "sr4", "sr5", "sr6", "sr7", "sr8", "sr9",
      "mem", "triad", "comm"
    )
  ) %>%
  ggplot(aes(x = model, y = BIC, color = direction)) +
  gg +
  geom_point() +
  ggtitle("Model goodness-of-fit") +
  theme(legend.position = "bottom")

goodness_of_fit %>%
  kable()
```

Let's take a look at the best-fitting model estimates:

```{r best-model}
test_sr7_out %>%
  tidy(conf.int = TRUE) %>%
  kable()

vif(test_sr7_out)
```



# SR centrality

Can the best-fitting SR explain an individual's perceptions of other network members' centrality?

```{r def-sr-centrality}
sr_centrality <- sr_out %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )
```

To answer this, we'll do a random-effects analysis where we take each individual's idiosyncratic mental representation of their network, then rank-correlate it with the SR's individual-level predictions.

```{r test-sr-centrality}
sr_centrality_corr <- sr_centrality %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(sr_centrality, perceived_centrality, method = "spearman")
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr %>%
  mutate(spearman = atanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )

sr_centrality_corr %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
```

Let's try to visualize these results in an intuitive format.

```{r plot-sr-centrality}
plot_sr_centrality_group <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group

plot_sr_centrality_individual <- sr_centrality %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_standard.pdf"),
    plot = plot_sr_centrality_individual,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

We're particularly interested in the manager pairs 1-7 and 18-21. On an individual-by-individual basis, does the model correctly predict, in each pair, which manager is represented as being more central? We'll do a prevalence test to see whether the model made correct predictions for significantly greater than 50% of subjects.

```{r prevalence-test}
sr_centrality %>%
  # Find and retain only relevant node-pairs
  mutate(
    pair_id = case_when(
      node_id %in% c(1, 7) ~ "1-7",
      node_id %in% c(18, 21) ~ "18-21"
    )
  ) %>%
  drop_na() %>%
  # Whichever manager is represented as being more central in each node-pair,
  # does the model make a correct prediction?
  group_by(sub_id, pair_id) %>%
  mutate(
    across(
      c(sr_centrality, perceived_centrality),
      ~.x - first(.x)
    )
  ) %>%
  filter(node_id %in% c(7, 21)) %>%
  summarise(
    correct = sr_centrality * perceived_centrality > 0,
    .groups = "drop"
  ) %>%
  # Tally correct predictions
  count(pair_id, correct) %>%
  # Prevalence test
  filter(correct) %>%
  split(.$pair_id) %>%
  map_dfr(
    .f = ~.x %>%
      select(n) %>%
      deframe() %>%
      binom.test(x = ., n = 21, p = 0.5, alternative = "greater") %>%
      tidy(),
    .id = "pair_id"
  ) %>%
  kable(caption = "Prevalence test (> 50%)")
```

We'd previously noted that betweenness seems to track discrepancies between a manager's true versus perceived centrality. Does SR centrality track with betweenness? The loess curve suggests the relationship is nonlinear, but there is a strongly significant rank correlation between the two.

```{r test-sr-betweenness}
sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  ggplot(aes(x=sr_centrality, y=Betweenness)) +
  gg +
  geom_smooth(method = "loess", se = FALSE) +
  geom_label(aes(label = name)) +
  xlab("Successor Representation \u03B3 = 0.7")

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality_sr_vs_betweenness.pdf"),
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
}

sr_centrality %>%
  group_by(name = node_id) %>%
  summarise(sr_centrality = mean(sr_centrality)) %>%
  left_join(centrality_differences) %>%
  with(cor.test(Betweenness, sr_centrality, method = "spearman", exact = FALSE)) %>%
  tidy() %>%
  kable(caption = "Betweenness vs SR gamma 0.7")
```


# Company leadership

So far, we have not provided the model with any privileged knowledge about the company's senior leadership hierarchy. Yet, using observations constrained by the network structure, the model has identified most of the company's leadership anyways.

The CEO is coded as node 7, and the four Vice Presidents are coded as 2, 14, 18, and 21. We see clear evidence that the model identifies nodes 2, 7, 18, and 21 as being central figures, but the model makes conspicuously incorrect predictions about manager 14.

Could the model in principle do a better job at identifying manager 14 as a central network member? To test this, we'll make one additional assumption about managers' observations: they observe the senior leadership twice as much as other network members.

```{r leadership-compute-sr}
sr_observations_leadership <- obs_edgelist_out %>%
  # Double-count CEO and VPs
  bind_rows(
    obs_edgelist_out %>%
      filter(from %in% c(2, 7, 14, 18, 21) | to %in% c(2, 7, 14, 18, 21))
  ) %>%
  expand_grid(rep = 1:100) %>%
  group_by(sub_id, rep) %>%
  slice_sample(prop = 1) %>%
  ungroup()

sr_leadership <- sr_observations_leadership %>%
  expand_grid(gamma = seq(0.1, 0.9, 0.1), .) %>%
  group_by(sub_id, gamma) %>%
  nest() %>%
  mutate(
    sr = map2(
      .x = data, .y = gamma,
      ~build_rep_sr(.x, this_alpha = 0.1, this_gamma = .y)
    )
  ) %>%
  unnest(sr) %>%
  ungroup() %>%
  select(-data) %>%
  mutate(gamma = gamma * 10) %>%
  pivot_wider(
    names_from = gamma,
    names_prefix = "sr",
    values_from = sr_value
  )
```

We'll test whether the model goodness-of-fit is better at $\gamma = 0.7$, which was the best-fitting model from before:

```{r leadership-bic}
test_sr7_leadership <- krackhardt_allocentric %>%
  rename(choice = edge) %>%
  left_join(schema_experts_out, by = c("sub_id", "from", "to")) %>%
  left_join(sr_leadership, by = c("sub_id", "from", "to")) %>%
  glmer(
    choice ~ mem_value + sr7 + (1 + sr7 | sub_id),
    family = "binomial", data = .
  )

bind_rows(
  glance(test_sr7_out) %>% mutate(model = "sr7 standard"),
  glance(test_sr7_leadership) %>% mutate(model = "sr7 leadership 2x")
) %>%
  kable()
```

```{r leadership-centrality}
sr_centrality_leadership <- sr_leadership %>%
  # SR centrality from best variant
  filter(from != to) %>%
  group_by(sub_id, node_id = to) %>%
  summarise(sr_centrality = mean(sr7), .groups = "drop") %>%
  # Perceived centrality
  left_join(
    krackhardt_allocentric %>%
      group_by(sub_id, node_id = to) %>%
      summarise(perceived_centrality = mean(edge), .groups = "drop"),
    by = c("sub_id", "node_id")
  )

sr_centrality_corr_leadership <- sr_centrality_leadership %>%
  group_by(sub_id) %>%
  nest() %>%
  mutate(
    spearman = map_dbl(
      .x = data,
      .f = ~with(
        .x, cor(
          sr_centrality, perceived_centrality, method = "spearman"
        )
      )
    )
  ) %>%
  ungroup() %>%
  select(-data)

sr_centrality_corr_leadership %>%
  mutate(spearman = atanh(spearman)) %>%
  select(spearman) %>%
  deframe() %>%
  t.test() %>%
  tidy() %>%
  kable(
    caption = str_c(
      "t-test of the average Spearman correlation between perceived and ",
      "SR-predicted centrality (z-transformed)"
    )
  )

sr_centrality_corr_leadership %>%
  summarise(mean_spearman = mean(spearman)) %>%
  kable()
```

```{r leadership-plot-centrality}
plot_sr_centrality_group_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Average across subjects to get group-level results
  group_by(node_id) %>%
  summarise(across(ends_with("_centrality"), mean)) %>%
  # For color-coding two pairs of managers with equivalent degree centrality,
  # but different perceived centrality
  mutate(
    colorcode = case_when(
      node_id %in% c(1, 7) ~ "a",
      node_id %in% c(18, 21) ~ "b"
    )
  ) %>%
  ggplot(aes(x=perceived_centrality, y=sr_centrality)) +
  gg +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  geom_label(aes(label = node_id, fill = colorcode), show.legend = FALSE) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_x_continuous(name = "Perceived centrality (rank-ordered)") +
  scale_y_continuous(name = "SR centrality (rank-ordered)") +
  ggtitle("Perceived vs Successor Centrality (Group-Level)")

plot_sr_centrality_group_leadership

plot_sr_centrality_individual_leadership <- sr_centrality_leadership %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id, nrow = 3) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Representation (Simulated)")
  ) +
  ggtitle("Perceived vs Successor Centrality (Individual-Level)") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

plot_sr_centrality_individual_leadership

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_group_leadership.pdf"),
    plot = plot_sr_centrality_group_leadership,
    width = 6, height = 4,
    device = cairo_pdf, units = "in", dpi = 300
  )
  
  ggsave(
    here("outputs", "07-krackhardt", "sr_centrality_individual_leadership.pdf"),
    plot = plot_sr_centrality_individual_leadership,
    width = 12, height = 8,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

# Figures for paper

```{r paper-network-graph}
network_for_paper <- krackhardt_g %>%
  mutate(
    Degree = centrality_degree(mode = "in"),
    node_colorcode = case_when(
      name %in% c(1, 7) ~ "a",
      name %in% c(18, 21) ~ "b"
    )
  ) %>%
  # To clean up some of the overplotting
  activate("edges") %>%
  mutate(
    edge_colorcode = case_when(
      from %in% c(1, 7) | to %in% c(1, 7) ~ "a",
      from %in% c(18, 21) | to %in% c(18, 21) ~ "b"
    )
  ) %>%
  # Plot the "ground truth" network
  ggraph(layout = "stress") +
  geom_edge_link(
    aes(
      start_cap = label_rect(node1.name, padding = margin(15, 15, 15, 15)),
      end_cap = label_rect(node2.name, padding = margin(15, 15, 15, 15)),
      color = edge_colorcode
    ),
    arrow = arrow(length = unit(5, "pt"), type = "closed"),
    show.legend = FALSE
  ) +
  geom_node_label(
    aes(label = name, size = Degree, fill = node_colorcode),
    show.legend = FALSE
  ) +
  scale_size(range = c(3, 10)) +
  scale_fill_manual(values = c("#a6cee3", "#b2df8a"), na.value = "grey95") +
  scale_edge_color_manual(
    values = c("#1f78b4", "#33a02c"), na.value = "grey80"
  ) +
  ggtitle("Advice Network (Krackhardt 1987)") +
  theme(
    panel.background = element_blank(),
    strip.background = element_blank(),
    panel.border = element_blank(),
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 13),
    strip.text = element_text(hjust = 0.5, size = 13),
    legend.position = "bottom"
  )

network_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "network_graph.pdf"),
    plot = network_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r paper-representation}
representation_for_paper <- (
  (plot_choice + ggtitle("Perceived (Empirical)")) |
    plot_sr7_out + ggtitle("Successor Rep. \u03B3 = 0.7 (Simulated)")
) +
  plot_annotation(
    title = "Network Representation",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )

representation_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "representation.pdf"),
    plot = representation_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```

```{r paper-centrality}
centrality_individual_for_paper <- sr_centrality %>%
  filter(sub_id %in% c(17, 19)) %>%
  # Get each subject's rank-ordered perceived and SR-predicted centrality
  group_by(sub_id) %>%
  mutate(across(ends_with("_centrality"), ~rank(.x, ties.method = "min"))) %>%
  ungroup() %>%
  # Formatting for plotting
  pivot_longer(
    ends_with("_centrality"),
    names_to = "metric",
    values_to = "centrality"
  ) %>%
  mutate(
    node_id = factor(node_id),
    sub_id = str_pad(sub_id, width = 2, side = "left", pad = "0"),
    sub_id = str_c("Manager ", sub_id)
  ) %>%
  ggplot(aes(x=node_id, y=centrality, fill=metric)) +
  facet_wrap(~sub_id) +
  geom_col(position = "identity", color = "black", alpha = 0.5) +
  coord_polar() +
  scale_fill_viridis_d(
    name = "Centrality",
    labels = c("Perceived (Empirical)", "Successor Rep. (Simulated)")
  ) +
  ggtitle("Individual-Level") +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 13),
    plot.subtitle = element_text(hjust = 0.5, size = 11),
    panel.grid.minor = element_blank(),
    panel.spacing = unit(0.75, "lines"),
    legend.box.spacing = unit(0.5, "lines"),
    legend.margin = margin(c(0, 0, 0, 0), unit = "lines"),
    legend.position = "bottom",
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

centrality_for_paper <- (
  (
    plot_sr_centrality_group +
      ggtitle("Group-Level") +
      scale_y_continuous(name = "Successor Rep. centrality (rank-ordered)")
  ) |
    centrality_individual_for_paper
) +
  plot_annotation(
    title = "Perceived vs Successor Centrality",
    theme = theme(plot.title = element_text(size = 13, hjust = 0.5))
  )

centrality_for_paper

if (knitting) {
  ggsave(
    here("outputs", "07-krackhardt", "centrality.pdf"),
    plot = centrality_for_paper,
    width = 12, height = 5,
    device = cairo_pdf, units = "in", dpi = 300
  )
}
```


# Save data

It was a bit of a pain getting the raw data into shape, so we'll save the cleaned versions for posterity.

```{r save-data}
if (knitting) {
  krackhardt_egocentric %>%
    write_csv(here("data", "krackhardt", "clean_egocentric.csv"))
  
  krackhardt_allocentric %>%
    rename(choice = edge) %>%
    write_csv(here("data", "krackhardt", "clean_allocentric.csv"))
}
```


# Session info

```{r session-info}
sessionInfo()
```

